@filip-michalsky filip-michalsky commented Jan 15, 2026

Description

Adds BrowserEnv, a unified browser automation integration for the verifiers library that supports two operational modes:

DOM Mode (mode="dom")

  • Uses the Stagehand Python SDK for natural language browser control
  • Tools: navigate, observe, act, extract - Stagehand's AI-driven primitives
  • Ideal for tasks that benefit from semantic understanding of page elements

CUA Mode (mode="cua")

  • Vision-based primitives for Computer Use Agent workflows
  • Tools: click, double_click, type_text, keypress, scroll, goto, back, forward, wait, screenshot
  • Requires companion TypeScript server (included) for CDP connection via Stagehand internals
  • Automatic screenshot management with context trimming for VLM input
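The context trimming mentioned in the last bullet can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the function name `trim_screenshots`, the `max_screenshots` parameter, the placeholder text, and the OpenAI-style message shape (content as a list of parts, with images tagged `"type": "image_url"`) are all assumptions made for illustration.

```python
from copy import deepcopy

# Hypothetical placeholder that replaces screenshots dropped from the context.
PLACEHOLDER = {"type": "text", "text": "[screenshot omitted]"}


def trim_screenshots(messages, max_screenshots=2):
    """Return a copy of `messages` that keeps only the newest image parts.

    Assumes OpenAI-style chat messages; older screenshots are swapped for a
    short text placeholder so the turn structure stays intact.
    """
    trimmed = deepcopy(messages)  # never mutate the caller's history
    remaining = max_screenshots
    # Walk newest-to-oldest so the most recent screenshots survive.
    for message in reversed(trimmed):
        content = message.get("content")
        if not isinstance(content, list):
            continue  # plain string content carries no images
        for i in range(len(content) - 1, -1, -1):
            part = content[i]
            if isinstance(part, dict) and part.get("type") == "image_url":
                if remaining > 0:
                    remaining -= 1
                else:
                    content[i] = dict(PLACEHOLDER)
    return trimmed
```

Replacing old images with a placeholder, rather than deleting the messages, keeps the tool-call history readable for the model while bounding the number of images sent to the VLM.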

Both modes support local browser execution or Browserbase cloud infrastructure.
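As a quick reference, the mode-to-tool mapping described above can be written as a lookup table. The tool names come from the lists above; the `TOOLS_BY_MODE` table and the `tools_for_mode` helper are hypothetical names for this sketch and are not claimed to exist in BrowserEnv.

```python
# Tool names per mode, copied from the description above.
TOOLS_BY_MODE = {
    "dom": ("navigate", "observe", "act", "extract"),
    "cua": (
        "click", "double_click", "type_text", "keypress", "scroll",
        "goto", "back", "forward", "wait", "screenshot",
    ),
}


def tools_for_mode(mode: str) -> tuple:
    """Return the tool names for `mode`, rejecting unknown modes early."""
    try:
        return TOOLS_BY_MODE[mode]
    except KeyError:
        expected = ", ".join(sorted(TOOLS_BY_MODE))
        raise ValueError(f"unknown mode {mode!r}; expected one of: {expected}") from None
```

Failing fast on an unknown mode string avoids silently constructing an environment with no tools.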

What's included:

  • verifiers/envs/integrations/browser_env/ - Core integration (BrowserEnv, DOMMode, CUAMode)
  • verifiers/envs/integrations/browser_env/cua-server/ - TypeScript server for CUA mode
  • environments/browser_dom_example/ - Minimal DOM mode example
  • environments/browser_cua_example/ - Minimal CUA mode example
  • New [browser] extra: uv add 'verifiers[browser]'

Benchmarks (GAIA, WebVoyager, Mind2Web) have been pushed to Prime Hub under the browserbase/ namespace.

Type of Change

  • New feature (non-breaking change which adds functionality)

Testing

# DOM mode
prime eval run browserbase/browser-dom-example -m openai/gpt-4.1-mini

# CUA mode (start server first: cd verifiers/envs/integrations/browser_env/cua-server && ./start.sh)
prime eval run browserbase/browser-cua-example -m qwen/qwen3-vl-30b-a3b-instruct

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes

Future work:

  • Compile CUA TypeScript server to binary to remove Node.js dependency
  • Additional benchmark environments available on Prime Hub under browserbase/ org

Note

Adds a new browser automation integration with two modes and supporting assets.

  • Introduces BrowserEnv (DOM via Stagehand, CUA via vision primitives) with default prompts, env var validation, tool handling, screenshot filtering, and mode-specific message formatting
  • Exposes BrowserEnv in verifiers/__init__.py and adds integration package under verifiers/envs/integrations/browser_env
  • Provides example environments: environments/browser_dom_example and environments/browser_cua_example (docs, datasets, loaders, pyprojects)
  • Bundles CUA server (verifiers/envs/integrations/browser_env/cua-server/) with Fastify/Stagehand code and scripts
  • Adds [browser] optional dependency group in pyproject.toml and updates integration docs (docs/environments.md, verifiers/envs/integrations/README.md, environments/AGENTS.md)
  • Adds comprehensive tests for modes, prompts, validation, DOM LLM config, screenshot filtering, example datasets, and updates tests/test_envs.py skip list
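The env var validation listed above could look something like the sketch below. It is an illustration under stated assumptions, not the PR's implementation: the helper names are hypothetical, and the variable names in the usage note (`BROWSERBASE_API_KEY`, `BROWSERBASE_PROJECT_ID`) are assumed rather than taken from this PR.

```python
import os
from typing import Mapping, Sequence


def missing_env_vars(required: Sequence[str], env: Mapping[str, str] = os.environ) -> list:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


def validate_env(required: Sequence[str], env: Mapping[str, str] = os.environ) -> None:
    """Raise one error naming every missing variable, not just the first."""
    missing = missing_env_vars(required, env)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
```

A call such as `validate_env(["BROWSERBASE_API_KEY", "BROWSERBASE_PROJECT_ID"])` before launching a cloud session would then report all missing settings in a single message.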

Written by Cursor Bugbot for commit e688da6.

CLAassistant commented Jan 15, 2026

CLA assistant check
All committers have signed the CLA.

@filip-michalsky changed the title from "ruff precommit" to "Add Browser Env Integration" on Jan 15, 2026
@filip-michalsky marked this pull request as ready for review on January 16, 2026 16:02
@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 3 potential issues.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.
